1  Introduction


In a computer network where a set of processors wish to perform some computational
task, communication can sometimes become a bottleneck, especially when communication
resources are scarce. This is particularly so in the area of parallel and VLSI computation
(see, e.g., [BT 89], [U 85]), where communication issues have been studied extensively. In
such contexts, it is desirable to design algorithms that require as little information exchange
as possible. Problems of minimizing the amount of exchanged information also arise in
the context of decentralized signal processing, where each local processor collects some
partial data to be processed collectively. In this paper, we study the “communication
complexity” (i.e., the minimum possible amount of information exchange) of some particular
computational tasks.

Generally speaking, communication complexity depends both on the topology of a computer
network and on the nature of the computational task under consideration. In this
paper, we ignore the topological issues by assuming that there are only two processors, say
$P_1$ and $P_2$. We use the following model of communication introduced by Abelson ([A 80]).
Let there be given a continuously differentiable function $f : D_x \times D_y \to \Re$, where $D_x$ and $D_y$
are some open subsets of $\Re^m$ and $\Re^n$, respectively. It is assumed that processor $P_1$ (respectively,
$P_2$) has access to a vector $x \in D_x$ (respectively, $y \in D_y$) and the formula defining $f$.
The processors $P_1$, $P_2$ proceed to evaluate $f(x,y)$ by exchanging messages, using a two-way
communication protocol, in which messages can be sent in both directions. Let us use $\pi$
to denote a two-way communication protocol and $r(\pi)$ to denote the number of messages
exchanged in $\pi$. In addition, let $T_{1\to 2}$ (respectively, $T_{2\to 1}$) denote the set of indices $i$ for
which the $i$-th message is sent from $P_1$ to $P_2$ (respectively, from $P_2$ to $P_1$). The protocol
$\pi$ consists of $r(\pi)$ functions $m_1, \ldots, m_{r(\pi)} : D_x \times D_y \to \Re$, with $m_i(x,y)$ being interpreted
as the value of the $i$-th message. These message functions must depend on the inputs $x$
and $y$ in a very special way. More specifically, for each $i$, there must exist some real-valued
function $\hat m_i$ such that

$m_i(x,y) = \hat m_i(x, m_1(x,y), \ldots, m_{i-1}(x,y)), \quad \forall (x,y) \in D_x \times D_y, \text{ if } i \in T_{1\to 2},$  (1.1)

$m_i(x,y) = \hat m_i(y, m_1(x,y), \ldots, m_{i-1}(x,y)), \quad \forall (x,y) \in D_x \times D_y, \text{ if } i \in T_{2\to 1}.$  (1.2)

Furthermore, we require that either:

a)  There exists a function $h$ such that

$f(x,y) = h(x, m_1(x,y), \ldots, m_{r(\pi)}(x,y)), \quad \forall (x,y) \in D_x \times D_y,$  (1.3)

(this corresponds to the case where processor $P_1$ performs the final computation) or,

b)  There exists a function $h$ such that

$f(x,y) = h(y, m_1(x,y), \ldots, m_{r(\pi)}(x,y)), \quad \forall (x,y) \in D_x \times D_y,$  (1.4)

which corresponds to the case where processor $P_2$ computes the final result.
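The model in Eqs. (1.1)-(1.4) can be mimicked by a small simulation. The following is only a sketch under stated assumptions: the names `run_protocol`, `schedule` and `final_owner` are hypothetical and do not come from [A 80], and the inner-product function used as an example is chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# A message function sees the sender's own input and all earlier messages.
MessageFn = Callable[[Vector, List[float]], float]


def run_protocol(
    x: Vector,
    y: Vector,
    schedule: List[Tuple[int, MessageFn]],
    final_owner: int,
    h: Callable[[Vector, List[float]], float],
) -> Tuple[float, int]:
    """Run a two-processor protocol; return (computed value, message count).

    schedule[i] = (sender, m_hat), with sender 1 for P1 (input x)
    and sender 2 for P2 (input y).
    """
    inputs = {1: x, 2: y}
    messages: List[float] = []
    for sender, m_hat in schedule:
        # Eqs. (1.1)-(1.2): the i-th message depends only on the sender's
        # own input and on the messages m_1, ..., m_{i-1} sent so far.
        messages.append(m_hat(inputs[sender], list(messages)))
    # Eqs. (1.3)-(1.4): the final evaluation uses one input plus all messages.
    return h(inputs[final_owner], messages), len(messages)


# Hypothetical example: f(x, y) = x[0]*y[0] + x[1]*y[1], computed one-way.
schedule = [
    (2, lambda y, past: y[0]),  # P2 sends y_0
    (2, lambda y, past: y[1]),  # P2 sends y_1
]
value, r = run_protocol(
    x=(3.0, 4.0),
    y=(5.0, 6.0),
    schedule=schedule,
    final_owner=1,  # P1 performs the final computation, as in case a)
    h=lambda x, msgs: x[0] * msgs[0] + x[1] * msgs[1],
)
print(value, r)
```

Here the message count `r` plays the role of $r(\pi)$; the simulation says nothing about smoothness of the message functions, which is a separate requirement of the model.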


Typically, some smoothness constraints are imposed on the functions $m_i$, $\hat m_i$ and $h$.
For example, [A 80] considers the class of two-way communication protocols (denoted by
$\Pi_2(f; D_x \times D_y)$) in which the functions $m_i$, $\hat m_i$ and $h$ are twice continuously differentiable.
In this paper, we consider a more general class of protocols in which the message functions
$m_i$, $\hat m_i$ are once continuously differentiable and the final evaluation function $h$ is continuous.
We denote this class of two-way protocols for computing $f$ by $\Pi_1(f; D_x \times D_y)$. We define
the two-way communication complexity of computing $f$ with protocols in $\Pi_2(f; D_x \times D_y)$
as

$C_2(f; D_x \times D_y) = \inf_{\pi \in \Pi_2(f; D_x \times D_y)} r(\pi).$

We define the quantity $C_1(f; D_x \times D_y)$ similarly. Notice that $\Pi_2(f; D_x \times D_y) \subset \Pi_1(f; D_x \times D_y)$.
Thus, $C_2(f; D_x \times D_y) \ge C_1(f; D_x \times D_y)$. As discussed in [L 89], $\Pi_1(f; D_x \times D_y)$ is,
in some sense, the most general class of protocols for which the notion of communication
complexity is well defined for problems involving continuous variables. [L 89] also contains
a discussion of how to implement in practice the “continuous” communication protocols,
whose messages are real numbers, by using binary strings.


The most fundamental work on two-way communication complexity is due to Abelson
([A 80]), who established a general lower bound for $C_2(f; D_x \times D_y)$. In particular, let
$f : D_x \times D_y \to \Re$ be a twice continuously differentiable function and let $H_{xy}(f)$ denote the
matrix (of size $m \times n$) whose $(i,j)$-th entry is given by $\partial^2 f / \partial x_i \partial y_j$. The following result was
proved in [A 80]:

Theorem 1.1  For any $p \in D_x \times D_y$, we have

$C_2(f; D_x \times D_y) \ge \mathrm{rank}\,(H_{xy}(f))(p).$
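As an illustration of how the bound in Theorem 1.1 is evaluated, the sketch below computes the mixed second-derivative matrix $H_{xy}(f)$ and its rank for a bilinear function $f(x,y) = x^T Q y$. The helper names `matrix_rank` and `mixed_hessian`, and the particular matrix `Q`, are assumptions made for this example. The mixed difference quotient used here is exact for bilinear $f$; for a general $f$ it would only approximate the mixed partial derivatives.

```python
from fractions import Fraction
from typing import Callable, List, Sequence


def matrix_rank(rows: Sequence[Sequence]) -> int:
    """Rank by exact Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(a[0]) if a else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, len(a)):
            factor = a[r][col] / a[rank][col]
            a[r] = [v - factor * p for v, p in zip(a[r], a[rank])]
        rank += 1
    return rank


def mixed_hessian(f: Callable, x: List, y: List, h=Fraction(1)) -> List[List]:
    """Mixed difference quotient for d^2 f / dx_i dy_j (exact if f is bilinear)."""
    def shift(v, k):
        return [vi + (h if i == k else 0) for i, vi in enumerate(v)]

    base = f(x, y)
    return [
        [
            (f(shift(x, i), shift(y, j)) - f(shift(x, i), y)
             - f(x, shift(y, j)) + base) / (h * h)
            for j in range(len(y))
        ]
        for i in range(len(x))
    ]


# Hypothetical 3 x 2 example matrix; f(x, y) = x^T Q y.
Q = [[1, 2], [2, 4], [0, 1]]


def f(x, y):
    return sum(x[i] * Q[i][j] * y[j] for i in range(3) for j in range(2))


H = mixed_hessian(f, [1, 1, 1], [1, 1])
abelson_bound = matrix_rank(H)
print(abelson_bound)
```

For this bilinear example the matrix `H` coincides with `Q`, so the bound of Theorem 1.1 is simply the rank of `Q`.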

Notice  that  Theorem  1.1  only  takes  into  account  the  second  order  derivatives  of  /  and 
ignores  the  derivatives  of  other  orders.  Thus,  this  bound  should  not  be  expected  to  be 
tight,  as  was  shown  in  [LT  89]. 

In this paper, we derive a new general lower bound which is different from Theorem 1.1.
Our result (Theorem 2.1) makes use of the first order derivatives of $f$ and is fairly intuitive,
but surprisingly difficult to prove. Our work was motivated by the problem of distributed
computation of a root of a polynomial equation of degree $n - 1$. We apply our result to this
problem and obtain a lower bound of $n$, in contrast to the $O(1)$ lower bound obtained from
Abelson’s result. In [L 89], a similar $O(n)$ lower bound is established for the same problem,
but under a more restricted class of communication protocols in which the functions $m_i$, $\hat m_i$
($i = 1, \ldots, r(\pi)$) are assumed to be polynomials. The proof in [L 89] makes use of a result
from dimension theory and is algebraic in nature, in contrast to the analytic approach in
the proof given here.

In related work ([LT 89]), Abelson’s result has been extended by considering more restricted
classes of communication protocols; in particular, some improved lower bounds on
one-way and two-way communication complexity have been obtained by exploiting the algebraic
structure present in certain problems. Communication complexity has also been
studied under discrete communication models (see, e.g., [MS 82], [PS 82], [PT 82], [Y 79]).
In these models, the messages are no longer real numbers, but binary strings. A substantial
amount of research has been devoted to the study of the communication complexity of
selected combinatorial problems ([AU 83], [PE 86], [U 84]). A different model is introduced
in [TL 87] for the problem of approximately minimizing the sum of two convex functions
under the assumption that each convex function is known to a different processor.

The rest of this paper is organized as follows. In Section 2, we prove our main result
(Theorem 2.1). In Section 3, we apply the result of Section 2 to establish a lower bound of
$n$ for the problem of computing a root of a polynomial equation of degree $n - 1$. In Section
4, we compare our result with Abelson’s. Finally, the appendix contains certain
results from multidimensional calculus that are needed in Section 2.


2  Main  Result 

Let $f : D_x \times D_y \to \Re$ be a continuously differentiable function, where $D_x$ and $D_y$ are some
open subsets of $\Re^m$ and $\Re^n$, respectively. We use the notation $\nabla_x f(x,y)$ and $\nabla_y f(x,y)$ to
denote the $m$-dimensional (respectively, $n$-dimensional) vector whose components are the
partial derivatives of $f$ with respect to the components of $x$ (respectively, $y$). Also, for any
set $S \subset D_x$, we use $[\nabla_y f(x,y); x \in S]$ to denote the subspace of $\Re^n$ spanned by the vectors
$\nabla_y f(x,y)$, $x \in S$. Finally, for any set $S \subset D_y$, $[\nabla_x f(x,y); y \in S]$ is similarly defined.

Assumption 2.1  For any $y \in D_y$, we let

$\mathcal{S}(y) = \{\, S \subset D_x \mid f(S,y) \text{ contains an open interval} \,\}.$ *

(For any $x \in D_x$, $\mathcal{S}(x)$ is similarly defined.)

* The notation $f(S,y)$ stands for the set $\{ f(x,y) \mid x \in S \}$. Similar notation will be used later without
further comment.

a)  For any $y \in D_y$ and any nonempty open set $S \subset D_x$, we have $S \in \mathcal{S}(y)$.

b)  For any $x \in D_x$ and any nonempty open set $S \subset D_y$, we have $S \in \mathcal{S}(x)$.

c)  For some nonnegative integer $n_f$, we have

$\dim[\nabla_y f(x,y); x \in S] \ge n_f, \quad \forall y \in D_y,\ \forall S \in \mathcal{S}(y).$  (2.1)

d)  For some nonnegative integer $m_f$, we have

$\dim[\nabla_x f(x,y); y \in S] \ge m_f, \quad \forall x \in D_x,\ \forall S \in \mathcal{S}(x).$  (2.2)

Our  main  result  is  the  following: 

Theorem 2.1  There holds

$C_1(f; D_x \times D_y) \ge \min\{n_f, m_f\}.$  (2.3)

Proof: Let $r = C_1(f; D_x \times D_y)$. We first prove that it is sufficient to show the lower bound
(2.3) under the additional assumption

$r = \min_{\tilde D_x, \tilde D_y} C_1(f; \tilde D_x \times \tilde D_y),$  (2.4)

where the minimum is taken over all nonempty open subsets $\tilde D_x$, $\tilde D_y$ of $D_x$, $D_y$, respectively.
In fact, suppose that we have already shown that Theorem 2.1 is true under the assumption
(2.4). Let us now show (2.3) when Eq. (2.4) does not hold. In this case, there exist some
$r' < r$ and some nonempty open subsets $D'_x$, $D'_y$ of $D_x$, $D_y$ such that

$r' = C_1(f; D'_x \times D'_y) = \min_{\tilde D_x \subset D'_x,\ \tilde D_y \subset D'_y} C_1(f; \tilde D_x \times \tilde D_y),$

where the minimum is taken over all nonempty open subsets $\tilde D_x$, $\tilde D_y$ of $D'_x$, $D'_y$. Thus, Eq.
(2.4) holds with $r$, $D_x$ and $D_y$ replaced by $r'$, $D'_x$ and $D'_y$, respectively. Since any nonempty
open subset of $D'_x$ (respectively, $D'_y$) is also a nonempty open subset of $D_x$ (respectively, $D_y$), we
see that Assumption 2.1 remains valid (with the same constants $n_f$, $m_f$) when $D_x$, $D_y$ are
replaced by $D'_x$, $D'_y$. Therefore, Theorem 2.1 applies and shows that $r > r' \ge \min\{n_f, m_f\}$,
which shows that Theorem 2.1 holds regardless of assumption (2.4).

In the rest of the proof, we will assume that (2.4) holds. Let us consider a protocol that
uses exactly $r$ messages, described by (cf. Section 1)

$m_i(x,y) = \hat m_i(x, m_1(x,y), \ldots, m_{i-1}(x,y)), \quad \forall (x,y) \in D_x \times D_y, \text{ if } i \in T_{1\to 2},$  (2.5)

$m_i(x,y) = \hat m_i(y, m_1(x,y), \ldots, m_{i-1}(x,y)), \quad \forall (x,y) \in D_x \times D_y, \text{ if } i \in T_{2\to 1},$  (2.6)

where each $m_i$ and $\hat m_i$ is a continuously differentiable function. Without loss of generality,
we assume that the final evaluation of $f$ is performed by processor $P_1$. Thus, there exists
some continuous function $h$ such that

$f(x,y) = h(x, m_1(x,y), \ldots, m_r(x,y)), \quad \forall (x,y) \in D_x \times D_y.$  (2.7)

We introduce some notation. Let $u = (x,y)$ and let $D = D_x \times D_y$. Let also $m(u) =
(m_1(u), \ldots, m_r(u))$ and let $\nabla m(u)$ be the $(m + n) \times r$ matrix whose $i$-th column is the
gradient vector $\nabla m_i(u)$, $i = 1, \ldots, r$. Define

$k = \max_{u \in D} \mathrm{rank}\,[\nabla m(u)].$  (2.8)


Lemma  2.1  k  =  r. 

Proof: We show this by contradiction. Suppose that $r > k$. Consider the continuously
differentiable mapping $m : D \to \Re^r$, where $D = D_x \times D_y$ is an open set and $m(u) =
(m_1(u), \ldots, m_r(u))$. We claim that $\nabla m_1(x,y)$ is not identically zero on the set $D$. Indeed,
if this were the case, then $m_1(x,y)$ would be equal to a constant on the set $D$, and the first
message in the protocol would be redundant. Thus, there would exist a protocol that uses
$r - 1$ messages, contradicting the definition of $r$. We can therefore apply Theorem A.2 in
the appendix (with the correspondence $m \leftrightarrow F$, $D \leftrightarrow Q$, $r \leftrightarrow s$) to conclude that there
exist some positive integer $i$ and some continuously differentiable function $g$ such that

$m_{i+1}(u) = g(m_1(u), \ldots, m_i(u)), \quad \forall u \in \tilde D,$  (2.9)

where $\tilde D$ is some nonempty open subset of $D$. By taking a subset of $\tilde D$ if necessary, we can
assume that $\tilde D$ is of the product form $\tilde D_x \times \tilde D_y$, where $\tilde D_x$ and $\tilde D_y$ are some open subsets of
$D_x$ and $D_y$, respectively. Then, Eq. (2.9) would imply that the $(i+1)$-st message $m_{i+1}(x,y)$
is redundant for computing $f$ over $\tilde D_x \times \tilde D_y$, which contradicts the definition of $r$ (cf. Eq.
(2.4)).  Q.E.D.

Loosely speaking, Lemma 2.1 tells us that each message in an optimal protocol has
to contain some “new information” and therefore the corresponding gradient vectors have
to be linearly independent. Before we go on to the next lemma, we introduce some more
notation. Let $\bar D_x \subset D_x$, $\bar D_y \subset D_y$ be nonempty open sets such that $\nabla m(u)$ has full rank
for every $u \in \bar D_x \times \bar D_y$. (Such sets can be taken nonempty due to Lemma 2.1, and open
due to the continuity of $\nabla m(u)$.) We use $\bar D$ as a short notation for $\bar D_x \times \bar D_y$. Furthermore,
for any vector $c = (c_1, \ldots, c_r) \in \Re^r$, we let $c^i = (c_1, c_2, \ldots, c_i)$. Let also $r_1$ (respectively,
$r_2$) be the number of messages sent by processor $P_1$ (respectively, $P_2$). In addition, we use
the notation $[\nabla_x m_i(x,y); i \in T_{1\to 2}]$ to denote the $m \times r_1$ matrix whose column vectors are
$\nabla_x m_i(x,y) = \left( \frac{\partial m_i}{\partial x_1}(x,y), \ldots, \frac{\partial m_i}{\partial x_m}(x,y) \right)$, $i \in T_{1\to 2}$. The $n \times r_2$ matrix $[\nabla_y m_i(x,y); i \in T_{2\to 1}]$
is defined similarly. As a refinement of Lemma 2.1, we have the following:

Lemma 2.2  For any $(x,y) \in \bar D$, there holds

$\mathrm{rank}\,[\nabla_x \hat m_i(x, c^{i-1}); i \in T_{1\to 2}] = r_1,$

and

$\mathrm{rank}\,[\nabla_y \hat m_i(y, c^{i-1}); i \in T_{2\to 1}] = r_2,$

where $c = m(x,y)$.

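Proof: By Lemma 2.1, we see that the matrix $\nabla m(x,y)$ has full rank (and its rank is equal
to $r$) over the set $\bar D$. Notice that by possibly reindexing the columns of the matrix $\nabla m(x,y)$
we can write $\nabla m(x,y)$ in the form

$\nabla m(x,y) = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$

where $A_{11} = [\nabla_x m_i(x,y); i \in T_{1\to 2}]$ and $A_{22} = [\nabla_y m_i(x,y); i \in T_{2\to 1}]$. From Eqs. (2.5)-
(2.6), it is easily seen that for each $i \in T_{2\to 1}$, there exists a continuously differentiable
function $M_i$ such that

$m_i(x,y) = M_i\left(y, \{ m_j(x,y) : j < i,\ j \in T_{1\to 2} \}\right), \quad i \in T_{2\to 1}.$  (2.10)

(In other words, a message sent by processor $P_2$ can be expressed as a function of $y$ and
the messages already received.) By differentiating Eq. (2.10) with respect to $x$, we obtain

$\nabla_x m_i(x,y) = \sum_{j < i,\ j \in T_{1\to 2}} d_j(x,y)\, \nabla_x m_j(x,y), \quad i \in T_{2\to 1},$  (2.11)

where each $d_j(x,y)$ is a suitable scalar. Thus,

$\nabla_x m_i(x,y) \in \mathrm{span}\,\{ \nabla_x m_j(x,y); j \in T_{1\to 2} \}, \quad \forall (x,y) \in \bar D,\ \forall i \in T_{2\to 1}.$

This means that the columns of $A_{12}$ belong to the span of the columns of $A_{11}$ and therefore

$\mathrm{rank}\,[\, A_{11} \ A_{12} \,] = \mathrm{rank}(A_{11}) \le r_1.$

Similarly, one can show that

$\mathrm{rank}\,[\, A_{21} \ A_{22} \,] = \mathrm{rank}(A_{22}) \le r_2.$

On the other hand,

$r = r_1 + r_2$
$\ \ \ge \mathrm{rank}(A_{11}) + \mathrm{rank}(A_{22})$
$\ \ = \mathrm{rank}\,[\, A_{11} \ A_{12} \,] + \mathrm{rank}\,[\, A_{21} \ A_{22} \,]$
$\ \ \ge \mathrm{rank} \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$
$\ \ = \mathrm{rank}\,[\nabla m(x,y)]$
$\ \ = r, \quad \forall (x,y) \in \bar D.$

This implies that

$\mathrm{rank}(A_{11}) = \mathrm{rank}\,[\nabla_x m_i(x,y); i \in T_{1\to 2}] = r_1$

and

$\mathrm{rank}(A_{22}) = \mathrm{rank}\,[\nabla_y m_i(x,y); i \in T_{2\to 1}] = r_2.$

To show that $\mathrm{rank}\,[\nabla_x \hat m_i(x, c^{i-1}); i \in T_{1\to 2}] = r_1$, we differentiate Eq. (2.5) to obtain

$\nabla_x m_i(x,y) = \nabla_x \hat m_i(x, c^{i-1}) + \sum_{j=1}^{i-1} \frac{\partial \hat m_i}{\partial m_j}(x, c^{i-1})\, \nabla_x m_j(x,y), \quad \text{if } i \in T_{1\to 2},$  (2.12)

where $c = m(x,y)$ and $(x,y) \in \bar D$. Using Eq. (2.11), we see that $\sum_{j=1}^{i-1} \frac{\partial \hat m_i}{\partial m_j}(x, c^{i-1})\, \nabla_x m_j(x,y)$
can be written as a linear combination of the vectors $\{ \nabla_x m_j(x,y); j \le i - 1,\ j \in T_{1\to 2} \}$.
Therefore, Eq. (2.12) shows that

$[\nabla_x \hat m_i(x, c^{i-1}); i \in T_{1\to 2}] = [\nabla_x m_i(x,y); i \in T_{1\to 2}]\, C = A_{11} C,$

where $C$ is some upper triangular matrix whose diagonal entries are equal to 1. Hence
$\mathrm{rank}\,[\nabla_x \hat m_i(x, c^{i-1}); i \in T_{1\to 2}] = \mathrm{rank}(A_{11}) = r_1$. The equality

$\mathrm{rank}\,[\nabla_y \hat m_i(y, c^{i-1}); i \in T_{2\to 1}] = r_2$

can be shown by a similar argument.  Q.E.D.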

Let us fix some more notation. For any vector $c = (c_1, \ldots, c_r) \in \Re^r$, we let

$S(c) = \{\, (x,y) \in \bar D_x \times \bar D_y \mid m_i(x,y) = c_i,\ i = 1, \ldots, r \,\},$

$S_x(c) = \{\, x \in \bar D_x \mid \hat m_i(x, c^{i-1}) = c_i,\ \forall i \in T_{1\to 2} \,\},$  (2.13)

$S_y(c) = \{\, y \in \bar D_y \mid \hat m_i(y, c^{i-1}) = c_i,\ \forall i \in T_{2\to 1} \,\},$

$R = \{\, (m_1(x,y), \ldots, m_r(x,y)) \mid (x,y) \in \bar D_x \times \bar D_y \,\}.$

Lemma 2.3  For any $c \in R$, we have

$S(c) = S_x(c) \times S_y(c).$  (2.14)

Proof: We have, using the definition (2.13) and Eqs. (2.5)-(2.6),

$S(c) = \{\, (x,y) \in \bar D_x \times \bar D_y \mid \hat m_i(x, c^{i-1}) = c_i,\ \forall i \in T_{1\to 2};\ \hat m_i(y, c^{i-1}) = c_i,\ \forall i \in T_{2\to 1} \,\}$
$\ \ = S_x(c) \times S_y(c).$

Q.E.D.

We now fix some $(x^*, y^*) \in \bar D$ and let $c^* = m(x^*, y^*)$. Consider the mapping $F$ with
components

$F_i(x,c) = \hat m_i(x, c^{i-1}) - c_i, \quad \forall c \in R,\ x \in \bar D_x,\ i \in T_{1\to 2}.$

Thus, $F_i(x^*, c^*) = 0$, for all $i \in T_{1\to 2}$. Moreover, it follows from Lemma 2.2 that the matrix
$[\nabla_x F(x^*, c^*)]$ has full rank. We are therefore in a position to apply Theorem A.3
in the appendix (with the correspondence $u \leftrightarrow x$ and $v \leftrightarrow c$) to conclude that there exist
an open subset $U_1$ of $\Re^r$ containing $c^*$ and an open subset $\hat D_x$ of $\bar D_x$ containing $x^*$ such that
$S_x(c) \cap \hat D_x$ is nonempty and connected for all $c \in U_1$. Following a symmetrical argument,
we see that there exist open subsets $U_2 \subset \Re^r$ and $\hat D_y \subset \bar D_y$ such that $c^* \in U_2$, $y^* \in \hat D_y$,
and $S_y(c) \cap \hat D_y$ is nonempty and connected for all $c \in U_2$. Let $U = U_1 \cap U_2$. Clearly, $U$ is
nonempty since $c^* \in U$. In light of Lemma 2.3, we see that for all $c \in U$,

$\hat S(c) = S(c) \cap (\hat D_x \times \hat D_y)$
$\ \ = (S_x(c) \cap \hat D_x) \times (S_y(c) \cap \hat D_y),$

and the set $\hat S(c)$ is nonempty and connected. Let us use $\hat S_x(c)$ and $\hat S_y(c)$ to denote the sets
$S_x(c) \cap \hat D_x$ and $S_y(c) \cap \hat D_y$, respectively.

We now proceed to the main part of the proof. Since we have assumed that the final
result is evaluated by processor $P_1$, it follows that the last message $m_r(x,y)$ must have been
sent by processor $P_2$. (Otherwise, processor $P_1$ would be able to evaluate $f(x,y)$ on the
basis of $m_1(x,y), \ldots, m_{r-1}(x,y)$, and we would have a protocol with $r - 1$ messages, thus
contradicting Eq. (2.4).) Suppose that there exists some function $w : U \to \Re$ such that

$h(x,c) = w(c), \quad \forall c \in U,\ \forall x \in \hat S_x(c),$  (2.15)

where $h$ is the function given by Eq. (2.7). We claim that $w$ is a continuous function of $c$
in $U$. In fact, let $c$ be an arbitrary vector in $U$ and let $\{ c_i \in U;\ i = 1, 2, \ldots \}$ be a sequence
of vectors converging to $c$. By Theorem A.3 in the appendix, we can pick a convergent
sequence of vectors $\{ x_i \in \hat S_x(c_i),\ i = 1, 2, \ldots \}$ such that $\lim_{i\to\infty} x_i = x$ for some $x \in \hat D_x$. By
using Eq. (2.15) and the continuity of $h$, we see that

$\lim_{i\to\infty} w(c_i) = \lim_{i\to\infty} h(x_i, c_i) = h(x,c) = w(c),$

which implies that $w$ is continuous on $U$. Since for any $(x,y) \in m^{-1}(U) \cap (\hat D_x \times \hat D_y)$ we
have $m(x,y) \in U$, Eq. (2.15) yields

$f(x,y) = h(x, m(x,y)) = w(m(x,y)), \quad \forall (x,y) \in m^{-1}(U) \cap (\hat D_x \times \hat D_y).$

Thus, $f$ can be evaluated on the basis of $m(x,y)$ alone over the set $m^{-1}(U) \cap (\hat D_x \times \hat D_y)$
and this can be done by processor $P_2$ before sending the last message. Thus, Eq. (2.15)
leads to a protocol with $r - 1$ messages for computing $f$ over $m^{-1}(U) \cap (\hat D_x \times \hat D_y)$. This
will contradict Eq. (2.4) once we show that $m^{-1}(U) \cap (\hat D_x \times \hat D_y)$ is a nonempty open set.
To this effect, we notice that $\hat S(c)$ is nonempty and that

$\hat S(c) \subset m^{-1}(U) \cap (\hat D_x \times \hat D_y), \quad \forall c \in U,$

from which it follows that $m^{-1}(U) \cap (\hat D_x \times \hat D_y)$ is nonempty. Furthermore, $m^{-1}(U)$ is
open since it is the inverse image of the open set $U$ under a continuous mapping. Thus,
$m^{-1}(U) \cap (\hat D_x \times \hat D_y)$ is open, since $\hat D_x \times \hat D_y$ is open by construction.

Since no function $w$ can have the property (2.15), we conclude that there exists some
$c \in U$ such that $h(x,c)$ is a nonconstant function of $x$ on the set $\hat S_x(c)$. Since $h$ is a continuous
function and the set $\hat S_x(c)$ is nonempty and connected, we see that $h(\hat S_x(c), c)$ must contain
an open interval in $\Re$. Using the fact that $f(x,y) = h(x,c)$ for all $(x,y) \in \hat S_x(c) \times \hat S_y(c)$, we
have

$f(\hat S_x(c), y) = h(\hat S_x(c), c), \quad \forall y \in \hat S_y(c).$

Therefore, $f(\hat S_x(c), y)$ contains an open interval, or equivalently, $\hat S_x(c) \in \mathcal{S}(y)$ for all $y \in
\hat S_y(c)$ (cf. the definition of $\mathcal{S}(y)$ in Assumption 2.1). Let us fix some $\bar y \in \hat S_y(c)$. Then, using the definition of $n_f$ (Eq.
(2.1)), there exist $x^1, \ldots, x^{n_f} \in \hat S_x(c)$ such that $\nabla_y f(x^1, \bar y), \ldots, \nabla_y f(x^{n_f}, \bar y)$ are linearly
independent. Meanwhile, we observe that

$\hat S_y(c) = \{\, y \in \hat D_y \mid \hat m_i(y, c^{i-1}) = c_i,\ \forall i \in T_{2\to 1} \,\}$

and that, for any fixed $x \in \hat S_x(c)$, $f(x,y) = h(x,c)$ is a constant function of $y$ on the set
$\hat S_y(c)$. Moreover, by Lemma 2.2, we have

$\mathrm{rank}\,[\nabla_y \hat m_i(y, c^{i-1}); i \in T_{2\to 1}] = r_2, \quad \forall y \in \hat S_y(c).$  (2.16)

Thus, we are now in a position to apply Theorem A.4 (with the correspondence $A \leftrightarrow \hat S_y(c)$,
$F \leftrightarrow \{ \hat m_i(y, c^{i-1}) - c_i;\ i \in T_{2\to 1} \}$) and conclude that

$\nabla_y f(x, \bar y) \in \mathrm{span}\,\{ \nabla_y \hat m_i(\bar y, c^{i-1}),\ i \in T_{2\to 1} \}, \quad \forall x \in \hat S_x(c).$

Since each $x^j \in \hat S_x(c)$, we see that $\nabla_y f(x^j, \bar y)$ is in the span of the vectors $\{ \nabla_y \hat m_i(\bar y, c^{i-1}),\ i \in
T_{2\to 1} \}$, for $j = 1, \ldots, n_f$. Using the fact that the vectors $\nabla_y f(x^j, \bar y)$ are linearly independent,
we conclude that $r \ge r_2 \ge n_f \ge \min\{m_f, n_f\}$, which is the desired result, under the
assumption that processor $P_1$ performs the final evaluation of $f$. A similar argument yields
$r \ge r_1 \ge m_f \ge \min\{m_f, n_f\}$ for the case where processor $P_2$ performs the final evaluation
of $f$.  Q.E.D.

As a remark, we notice that in the preceding proof we have actually shown that $r_2 \ge n_f$
in the case where processor $P_1$ performs the final computation and $r_1 \ge m_f$ if processor $P_2$
performs the final computation. Therefore, if $C_1(f; D_x \times D_y) = \min\{m_f, n_f\}$, then either
$r_1 = m_f$ and $r_2 = 0$, or $r_1 = 0$ and $r_2 = n_f$. This means that our lower bound is tight only
for those problems for which one-way communication protocols are optimal.

Corollary 2.1  If $C_1(f; D_x \times D_y) = \min\{n_f, m_f\}$, then any optimal communication protocol
for computing $f$ over $D_x \times D_y$ is necessarily a one-way communication protocol.


3  Computing  a  Root  of  a  Polynomial 

We  now  apply  Theorem  2.1  to  the  distributed  computation  of  a  root  of  a  polynomial.  We 
shall  demonstrate  that  in  this  case  Abelson’s  result  is  far  from  being  optimal. 

Let $x = (x_0, \ldots, x_{n-1}) \in \Re^n$ and $y = (y_0, \ldots, y_{n-1}) \in \Re^n$; let $F(z; x, y)$ be the polynomial
in the scalar variable $z$ defined by

$F(z; x, y) = \sum_{i=0}^{n-1} (x_i + y_i) z^i.$  (3.1)

Processor $P_1$ (respectively, $P_2$) has access to the vector $x$ (respectively, $y$) and the objective
is the computation of a particular root of the polynomial $F(z; x, y)$. In order for the problem
to be well-defined, we must specify which one of the $n - 1$ roots of the polynomial is to
be computed. This is accomplished as follows. We fix some $(x^*, y^*) \in \Re^{2n}$ such that one
of the roots (call it $z^*$) of the polynomial $F(z; x^*, y^*)$ is real and simple. This root will
vary continuously and will remain a real and simple root as $x$ and $y$ vary in some open set
containing $(x^*, y^*)$. We formulate this discussion in the following result.

Lemma 3.1  Suppose that $z^*$ is a real and simple root of $F(z; x^*, y^*)$. Then, there exist open
sets $D_x, D_y \subset \Re^n$ such that $(x^*, y^*) \in D_x \times D_y$ and an infinitely differentiable function
$f : D_x \times D_y \to \Re$ such that $f(x^*, y^*) = z^*$ and

$F(f(x,y); x, y) = 0, \quad \forall (x,y) \in D_x \times D_y.$  (3.2)

Proof: Notice that $\frac{\partial F}{\partial z}(z^*; x^*, y^*) \ne 0$, since $z^*$ is a simple root. By the implicit function
theorem ([S 65, page 41]), we see that there exist an open set $D$ containing $(x^*, y^*)$ and an
infinitely differentiable function $g : D \to \Re$ such that $g(x^*, y^*) = z^*$ and $F(g(x,y); x, y) = 0$
for all $(x,y) \in D$. Now by the continuity of $\frac{\partial F}{\partial z}(z; x, y)\big|_{z = g(x,y)}$ at the point $(x^*, y^*)$, there
exist open sets $D_x$, $D_y$ such that $(x^*, y^*) \in D_x \times D_y \subset D$ and such that $\frac{\partial F}{\partial z}(z; x, y)\big|_{z = g(x,y)} \ne
0$ for all $(x,y) \in D_x \times D_y$. As a result, $g(x,y)$ is a simple root of the polynomial equation
$F(z; x, y) = 0$ for all $(x,y) \in D_x \times D_y$. Let $f$ be the restriction of $g$ on $D_x \times D_y$. Clearly, $f$
has all the desired properties.  Q.E.D.
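The construction in Lemma 3.1 can be illustrated numerically. The sketch below follows a simple real root by Newton's method as the coefficients vary, and compares a finite-difference derivative of the root with the value obtained by implicit differentiation of $F(f(x,y); x, y) = 0$, namely $\partial f / \partial y_m = -f^m / \frac{\partial F}{\partial z}(f; x, y)$. The function names, the starting point, the step size and the example polynomial $z^2 - 2$ are all assumptions made for this illustration; Newton's method is used only as a convenient stand-in for the implicit function $f$.

```python
from typing import Sequence


def poly(s: Sequence[float], z: float) -> float:
    """Evaluate F(z) = sum_i s[i] * z**i by Horner's rule."""
    result = 0.0
    for coef in reversed(s):
        result = result * z + coef
    return result


def dpoly(s: Sequence[float], z: float) -> float:
    """Evaluate dF/dz = sum_{i>=1} i * s[i] * z**(i-1)."""
    result = 0.0
    for i in range(len(s) - 1, 0, -1):
        result = result * z + i * s[i]
    return result


def track_root(x: Sequence[float], y: Sequence[float], z0: float,
               iters: int = 50) -> float:
    """Newton iteration on F(z; x, y), started near a simple real root."""
    s = [xi + yi for xi, yi in zip(x, y)]  # coefficients x_i + y_i of Eq. (3.1)
    z = z0
    for _ in range(iters):
        z -= poly(s, z) / dpoly(s, z)
    return z


# Hypothetical data: F(z; x*, y*) = z**2 - 2, with simple real root sqrt(2).
x_star = (-2.0, 0.0, 0.5)
y_star = (0.0, 0.0, 0.5)
root = track_root(x_star, y_star, z0=1.5)

# Central finite difference of the root with respect to y_m, here m = 0.
m, step = 0, 1e-6
y_plus = tuple(v + (step if i == m else 0.0) for i, v in enumerate(y_star))
y_minus = tuple(v - (step if i == m else 0.0) for i, v in enumerate(y_star))
fd = (track_root(x_star, y_plus, root) - track_root(x_star, y_minus, root)) / (2 * step)

# Implicit differentiation: df/dy_m = -f**m / (dF/dz evaluated at z = f).
s_star = [a + b for a, b in zip(x_star, y_star)]
implicit = -(root ** m) / dpoly(s_star, root)
print(root, fd, implicit)
```

The agreement between `fd` and `implicit` relies on the root being simple, so that the derivative of $F$ with respect to $z$ does not vanish there.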

By Lemma 3.1, we see that $f(x,y)$ is a root of $F(z; x, y)$ and is a well-defined smooth map
from $D_x \times D_y$ to $\Re$. We are interested in the communication complexity $C_1(f; D_x \times D_y)$
of computing $f(x,y)$ as $(x,y)$ varies in the set $D_x \times D_y$. We start by pointing out that
Abelson’s lower bound (Theorem 1.1) is rather weak.

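Lemma 3.2  The rank of the matrix $H_{xy}(f)$, whose $(i,j)$-th entry is equal to $\partial^2 f / \partial x_i \partial y_j$, is at
most 3, for any $(x,y) \in D_x \times D_y$.

Proof: We have

$\sum_{i=0}^{n-1} (x_i + y_i) (f(x,y))^i = 0, \quad \forall (x,y) \in D_x \times D_y.$

We differentiate both sides of the above equation with respect to $y_m$, to obtain

$\sum_{i=1}^{n-1} i (x_i + y_i) (f(x,y))^{i-1} \cdot \frac{\partial f}{\partial y_m} + (f(x,y))^m = 0, \quad \forall (x,y) \in D_x \times D_y,\ 0 \le m \le n - 1.$  (3.3)

We differentiate Eq. (3.3) further, with respect to $x_\ell$, to obtain

$\sum_{i=2}^{n-1} i (i-1) (x_i + y_i) (f(x,y))^{i-2} \frac{\partial f}{\partial x_\ell} \frac{\partial f}{\partial y_m} + \ell (f(x,y))^{\ell-1} \frac{\partial f}{\partial y_m} + \sum_{i=1}^{n-1} i (x_i + y_i) (f(x,y))^{i-1} \frac{\partial^2 f}{\partial x_\ell \partial y_m} + m (f(x,y))^{m-1} \frac{\partial f}{\partial x_\ell} = 0.$  (3.4)

Since $f(x,y)$ is a simple root, it follows that $\sum_{i=1}^{n-1} i (x_i + y_i) (f(x,y))^{i-1} \ne 0$. Equation (3.4)
shows that $\partial^2 f / \partial x_\ell \partial y_m$ is of the form $u_1(\ell) v_1(m) + u_2(\ell) v_2(m) + u_3(\ell) v_3(m)$, where $u_i(\ell)$, $v_i(m)$
are some real numbers depending on $x$, $y$. Therefore the rank of the matrix $H_{xy}(f)$ can be
at most 3, for any point $(x,y) \in D_x \times D_y$.  Q.E.D.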

We  now  illustrate  the  power  of  our  general  results,  by  deriving  a  lower  bound  that 
matches  the  obvious  upper  bound. 

Theorem 3.1  Let $D_x$, $D_y$ be as in Lemma 3.1. Then, $C_1(f; D_x \times D_y) = n$.

Proof: The upper bound $C_1(f; D_x \times D_y) \le n$ is obvious, so we concentrate on the proof of
the lower bound. To this effect, we will employ Theorem 2.1 and it suffices to verify that
Assumption 2.1 holds with $n_f = m_f = n$. Since the roots of a polynomial equation cannot
remain constant when the coefficients vary over an open set, it follows that the continuous
function $f(x,y)$ given by Lemma 3.1 satisfies parts (a) and (b) of Assumption 2.1. Now we
fix some $y \in D_y$ and some $S \in \mathcal{S}(y)$, that is, $S \subset D_x$ and $f(S,y)$ contains an open interval.
Let $c_1, \ldots, c_n$ be some distinct real numbers in $f(S,y)$ and $x^1, \ldots, x^n \in S$ such that

$f(x^i, y) = c_i, \quad i = 1, \ldots, n.$  (3.5)

Let $x^i_j$ be the $j$-th coordinate of $x^i$. Using Eq. (3.3), we see that

$a_i \nabla_y f(x^i, y) = - \left( 1, c_i, \ldots, c_i^{n-1} \right)^T,$  (3.6)

where $a_i = \sum_{j=1}^{n-1} j (x^i_j + y_j) c_i^{j-1}$. If we form a matrix whose columns are the vectors
$(1, c_i, \ldots, c_i^{n-1})$, $i = 1, \ldots, n$, this matrix is a Vandermonde matrix and is nonsingular,
because the values $c_1, \ldots, c_n$ are chosen to be distinct. Then, Eq. (3.6) implies that the
vectors $\nabla_y f(x^i, y)$, $i = 1, \ldots, n$, are linearly independent. This proves that $n_f = n$. The
proof that $m_f = n$ is similar.  Q.E.D.
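The Vandermonde step of this proof can be checked on a small instance. The sketch below uses exact rational arithmetic; the values in `c` (distinct root values) and in `a` (nonzero scalars standing in for the $a_i$) are hypothetical, and `det` is a helper written for this example rather than a library routine.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return result


# Hypothetical distinct values c_1, ..., c_n of the root, with n = 4.
c = [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(2)]
n = len(c)

# Column i of this matrix is (1, c_i, ..., c_i^{n-1}).
vandermonde = [[ci ** k for ci in c] for k in range(n)]
product_formula = prod(c[j] - c[i] for i, j in combinations(range(n), 2))

# Hypothetical nonzero scalars a_i; column i is rescaled by -1/a_i as in Eq. (3.6).
a = [Fraction(3), Fraction(-2), Fraction(1), Fraction(5)]
gradients = [[-vandermonde[k][i] / a[i] for i in range(n)] for k in range(n)]

print(det(vandermonde), product_formula, det(gradients))
```

Rescaling each column by a nonzero factor multiplies the determinant by a nonzero number, which is why the rescaled gradient vectors remain linearly independent whenever the $c_i$ are distinct.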

As  a  remark,  we  point  out  that  Theorem  3.1  is  in  some  sense  the  strongest  result 
possible.  The  only  assumptions  we  used  in  showing  Theorem  3.1  are  that  a)  the  message 
functions  are  continuously  differentiable;  b)  the  final  evaluation  function  is  a  continuous 
function;  c)  the  protocol  computes  a  root  of  a  polynomial  on  some  open  set.  As  discussed  in 
[L  89],  assumption  a)  is  necessary  since  its  removal  could  lead  to  unreasonable  conclusions. 
Assumption  b)  is  basic  and  natural  since  the  function  to  be  computed,  i.e.,  a  particular 
real  simple  root  of  some  polynomial,  is  continuous,  while  assumption  c)  is  minimal.  Finally, 
we  note  that  no  truly  two-way  communication  protocol  can  be  optimal.  In  other  words, 
if  each  processor  transmits  at  least  one  message,  then  at  least  n  +  1  messages  have  to  be 
exchanged.  This  is  a  simple  consequence  of  Corollary  2.1  of  Section  2. 


4  Comparison With Abelson’s Bound


In the previous section, we have seen that Theorem 2.1 can yield a much better bound than
Abelson’s result (Theorem 1.1). However, it is not true, as we shall see next, that Theorem
2.1 always provides a stronger lower bound. The reason is, loosely speaking, that our result
only places a constraint on the minimum number of messages that have to be sent by a
single processor, while Abelson’s result is a bound on the total number of messages sent
by both processors. As pointed out at the end of Section 2, any two-way communication
protocol that attains the lower bound in Theorem 2.1 is necessarily a one-way protocol.
Notice that our result makes use of information about the first order derivatives of the function
$f$. This is in contrast to Abelson’s result, which uses only the second order derivatives of $f$.
In what follows, we provide an example where Abelson’s bound is more effective than our
bound.

Example: Let $f(x,y) = x^T Q y$, where $Q$ is some $m \times n$ matrix and $x \in \Re^m$ and $y \in \Re^n$. By
Theorem 1.1, we see that $C_2(f; \Re^m \times \Re^n) \ge \mathrm{rank}(Q)$. Using the singular value decomposition
of $Q$, one can construct a protocol that uses exactly $\mathrm{rank}(Q)$ messages (see [LT 89]).
Therefore, we conclude that $C_2(f; \Re^m \times \Re^n) = \mathrm{rank}(Q)$. Next we apply Theorem 2.1 to $f$.
To this effect, we need to find the values $m_f$ and $n_f$.
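A protocol with $\mathrm{rank}(Q)$ messages can be sketched as follows. Note the substitution: instead of the singular value decomposition mentioned above, this sketch uses an exact rank factorization $Q = CF$ obtained by row reduction, which serves the same purpose of writing $Q$ as a sum of $\mathrm{rank}(Q)$ rank-one terms. The function names and the example data are assumptions; the construction in [LT 89] may differ in its details.

```python
from fractions import Fraction


def rank_factorization(q):
    """Return (C, F) with Q = C F, where C holds the pivot columns of Q
    and F holds the nonzero rows of the reduced row echelon form of Q."""
    a = [[Fraction(v) for v in row] for row in q]
    m, n = len(a), len(a[0])
    pivots, row = [], 0
    for col in range(n):
        if row == m:
            break
        pivot = next((r for r in range(row, m) if a[r][col] != 0), None)
        if pivot is None:
            continue
        a[row], a[pivot] = a[pivot], a[row]
        lead = a[row][col]
        a[row] = [v / lead for v in a[row]]
        for r in range(m):
            if r != row and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[row])]
        pivots.append(col)
        row += 1
    c = [[Fraction(q[i][j]) for j in pivots] for i in range(m)]  # m x r
    f = a[:row]                                                  # r x n
    return c, f


def bilinear_protocol(q, x, y):
    """One-way protocol: P1 sends r = rank(Q) numbers, P2 finishes."""
    c, f = rank_factorization(q)
    r = len(f)
    # Message k is u_k^T x, where u_k is the k-th column of C (known to both).
    messages = [sum(x[i] * c[i][k] for i in range(len(x))) for k in range(r)]
    # P2 combines the messages with v_k^T y, where v_k is the k-th row of F.
    value = sum(messages[k] * sum(f[k][j] * y[j] for j in range(len(y)))
                for k in range(r))
    return value, len(messages)


# Hypothetical example with rank(Q) = 2.
Q = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
x, y = [1, 2, 3], [1, 1, 1]
value, num_messages = bilinear_protocol(Q, x, y)
direct = sum(x[i] * Q[i][j] * y[j] for i in range(3) for j in range(3))
print(value, num_messages, direct)
```

Both processors are assumed to know $Q$ (the formula defining $f$), so the factorization itself costs no communication; only the `num_messages` real numbers are exchanged.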

Suppose that $\mathrm{rank}(Q) = r > 0$. Let $D_x$, $D_y$ be some connected open subsets of
$\Re^m$ and $\Re^n$ respectively. We assume that $0 \notin D_x$ and $0 \notin D_y$, in which
case $f(x, y)$ is nonconstant as $x$ or $y$ vary in an open subset of $D_x$ or $D_y$,
respectively. Thus, parts (a) and (b) of Assumption 2.1 are satisfied. We now show that
Assumption 2.1 can only hold with $\min\{m_f, n_f\} \le 2$. By the singular value
decomposition, there exist two linearly independent families of vectors $u_1, \ldots, u_r$
in $\Re^m$ and $v_1, \ldots, v_r$ in $\Re^n$, such that

$$Q = u_1 v_1^T + u_2 v_2^T + \cdots + u_r v_r^T. \tag{4.1}$$

It follows that $x^T Q y = \sum_{i=1}^{r} (u_i^T x)(v_i^T y)$. Since $r > 0$, there exists
some point $(x_0, y_0) \in D_x \times D_y$ such that $x_0^T Q y_0 \ne 0$. Hence, we can,
without loss of generality, assume that $(u_r^T x_0)(v_r^T y_0) \ne 0$. Let
$S = \{ x \in D_x \mid u_i^T x = u_i^T x_0,\ 1 \le i \le r - 1 \}$. Clearly, $S$ is nonempty
since $x_0 \in S$. We claim that if $r > 1$ then $f(S, y_0)$ contains an open interval. In
fact, equation (4.1) shows that

$$x^T Q y_0 = \sum_{i=1}^{r} (u_i^T x)(v_i^T y_0)
            = \sum_{i=1}^{r-1} (u_i^T x_0)(v_i^T y_0) + (u_r^T x)(v_r^T y_0),
            \quad \forall x \in S. \tag{4.2}$$

Since $u_r$ is linearly independent from $u_1, \ldots, u_{r-1}$, we see that $u_r^T x$ is a
nonconstant function of $x$ on $S$. Using the fact that $v_r^T y_0 \ne 0$ and Eq. (4.2), we
see that $x^T Q y_0$ is also a nonconstant function of $x$ on the set $S$. Note that $S$ is
connected because $D_x$ is assumed to be connected. It follows that $f(S, y_0)$ contains an
open interval. To see that $n_f \le 2$, we notice that

$$\nabla_y f(x, y_0) = \sum_{i=1}^{r-1} (u_i^T x_0) v_i + (u_r^T x) v_r,
  \quad \forall x \in S.$$

Hence, $\dim[\nabla_y f(x, y_0);\ x \in S] \le 2$. Thus, Assumption 2.1 can only hold with
$n_f \le 2$. The relation $m_f \le 2$ can be established in a symmetrical fashion. As a
result, we have shown that $\min\{m_f, n_f\} \le 2$.

Thus, for the problem $f(x, y) = x^T Q y$, Theorem 2.1 provides a lower bound of at most
2, as opposed to the lower bound of $\mathrm{rank}(Q)$ provided by Abelson’s result. Hence,
Theorem 2.1 can be quite far from optimal in general. Furthermore, the above example and
the results of Section 3 demonstrate that Theorems 1.1 and 2.1 are incomparable.
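A concrete instance may make the gap visible. The choice $Q = I_3$, the domains (open positive orthants) and the base point below are assumptions made only for this illustration; they are not taken from the paper.

```latex
% Assumed instance: m = n = r = 3, Q = I_3, so u_i = v_i = e_i in (4.1),
% D_x = D_y = \{ z \in \Re^3 : z_1, z_2, z_3 > 0 \}  (open, connected, 0 excluded).
\begin{align*}
f(x, y) &= x^T y, \qquad x_0 = y_0 = (1, 1, 1)^T, \qquad
          (u_3^T x_0)(v_3^T y_0) = 1 \ne 0, \\
S &= \{ x \in D_x \mid x_1 = 1,\ x_2 = 1 \} = \{ (1, 1, x_3)^T \mid x_3 > 0 \}, \\
f(x, y_0) &= 2 + x_3, \quad x \in S
          \quad\Longrightarrow\quad f(S, y_0) = (2, \infty), \\
\nabla_y f(x, y_0) &= x = (1, 1, 0)^T + x_3\, e_3, \quad x \in S .
\end{align*}
% The gradients over S span only the plane generated by (1,1,0)^T and e_3,
% i.e. a set of dimension 2, whereas rank(Q) = 3.
```

Here the gradients $\nabla_y f(x, y_0)$, $x \in S$, span a two-dimensional set, while Abelson’s bound for this $Q$ equals $\mathrm{rank}(Q) = 3$.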


A  Appendix 

This  appendix  contains  some  results  concerning  multivariable  functions  that  are  used  in 
Section  2. 

Let $F : U \times V \mapsto \Re^s$ be a continuously differentiable mapping, where $U$ and
$V$ are open subsets of $\Re^r$ and $\Re^t$ respectively. We assume that $r > s$. Let
$(u^*, v^*) \in U \times V$ be such that $\mathrm{rank}[\nabla_u F(u^*, v^*)] = s$. Then, the
matrix $\nabla_u F(u^*, v^*)$ has $s$ linearly independent rows and we can find a set
$J \subset \{1, \ldots, r\}$ of indices, of cardinality $s$, such that the vectors
$\left( \frac{\partial F_1}{\partial u_i}(u^*, v^*), \ldots,
\frac{\partial F_s}{\partial u_i}(u^*, v^*) \right)$, $i \in J$, are linearly independent.
We define the projection $\Pi : \Re^r \mapsto \Re^{r-s}$ by letting $\Pi(u)$ be the vector
with coordinates $u_i$, $i \notin J$. We have the following lemma.

Lemma A.1 There exists a connected open subset $R$ of $U \times V$, a connected open set
$S \subset \Re^{r+t}$, and a continuously differentiable function $g : S \mapsto R$ such that
$(u^*, v^*) \in R$,

$$S = \{ (F(u, v), \Pi(u), v) \mid (u, v) \in R \},$$

and such that

$$(u, v) = g(F(u, v), \Pi(u), v), \quad \forall (u, v) \in R. \tag{A.1}$$

Proof: Consider the mapping $q : U \times V \mapsto \Re^{r+t}$ defined by
$q(u, v) = (F(u, v), \Pi(u), v)$. We claim that $\nabla q(u^*, v^*)$ has full rank. To see
this, let us permute the rows of $\nabla q(u^*, v^*)$ so that the last $r + t - s$ rows
correspond to the partial derivatives with respect to the variables $v$ and $u_i$,
$i \notin J$. Then, $\nabla q(u^*, v^*)$ will have the structure

$$\nabla q(u^*, v^*) = \begin{bmatrix} A & 0 \\ B & I \end{bmatrix},$$

where $A$, $B$ are suitable submatrices of $\nabla F(u^*, v^*)$ and $I$ is the
$(r + t - s) \times (r + t - s)$ identity matrix. Each one of the $s$ rows of matrix $A$ is
a vector of the form
$\left( \frac{\partial F_1}{\partial u_i}(u^*, v^*), \ldots,
\frac{\partial F_s}{\partial u_i}(u^*, v^*) \right)$, $i \in J$, and these vectors are
linearly independent by construction. Thus $\det(\nabla q(u^*, v^*)) = \det(A) \ne 0$. The
result then follows from the inverse function theorem [S 65, page 35].
Q.E.D.

Theorem A.1 Let $Q$ be an open subset of $\Re^r$. Let $F : Q \mapsto \Re^s$ be a continuously
differentiable mapping such that

$$\max_{z \in Q} \mathrm{rank}(\nabla F(z)) = s. \tag{A.2}$$

Suppose that $f : Q \mapsto \Re$ is a continuously differentiable function with the property

$$\nabla f(z) \in \mathrm{span}\{\nabla F(z)\}, \quad \forall z \in Q.$$

Then, there exists some continuously differentiable function $h$ such that
$f(z) = h(F(z))$ for all $z \in R$, where $R$ is some open subset of $Q$.

Proof: Suppose that $z^* \in Q$ is a vector at which the maximum in Eq. (A.2) is attained.
By taking $t = 0$ and dropping the set $V$, we see that all the assumptions of Lemma A.1
are satisfied⁴, and thus Lemma A.1 applies. Let $R$, $S$ and $g$ be as in Lemma A.1. By
assumption, $\nabla f(z) \in \mathrm{span}\{\nabla F(z)\}$, $\forall z \in R$. Thus, for
every $z \in R$, there exists a vector $d(z) \in \Re^s$ such that

$$\nabla f(z) = \nabla F(z)\, d(z), \quad \forall z \in R. \tag{A.3}$$

Using Lemma A.1, we have

$$F(z) = F(g(F(z), \Pi(z))), \quad \forall z \in R,$$

or

$$u = F(g(u, v)), \quad \forall (u, v) \in S. \tag{A.4}$$

Let $\nabla_v g$ be the $(r - s) \times r$ matrix of the partial derivatives of $g$ with
respect to the components of $v$. Since the left hand side of Eq. (A.4) does not depend on
$v$, the chain rule yields

$$0 = \nabla_v g(u, v) \cdot \nabla F(g(u, v)), \quad \forall (u, v) \in S. \tag{A.5}$$

⁴ We have assumed that $r > s$ here. The proof for the case $r = s$ is essentially the same
except that $\Pi$ is redundant.
We use Lemma A.1 once more to obtain

$$f(z) = f(g(F(z), \Pi(z))), \quad \forall z \in R.$$

We define a function $h : S \mapsto \Re$ by letting

$$h(u, v) = f(g(u, v)), \quad \forall (u, v) \in S. \tag{A.6}$$

Notice that $h$ is continuously differentiable. Using the chain rule,

$$\nabla_v h(u, v) = \nabla_v g(u, v) \cdot \nabla f(g(u, v)), \quad \forall (u, v) \in S,$$

where $\nabla_v h(u, v)$ is the vector of partial derivatives of $h$ with respect to the
components of $v$. Using (A.3) and (A.5), we conclude that $\nabla_v h(u, v) = 0$, for all
$(u, v) \in S$. Since $S$ is open and connected, it is easily shown that $h$ is independent
of $v$ and there exists a function $\hat{h} : V \mapsto \Re$ such that

$$h(u, v) = \hat{h}(u), \quad \forall (u, v) \in S.$$

Here $V = F(R)$, which is obviously open and connected. For any $z \in R$, we have

$$f(z) = f(g(F(z), \Pi(z))) = h(F(z), \Pi(z)) = \hat{h}(F(z)),$$

as desired. Q.E.D.
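The steps of this proof can be traced on a small example. The functions and the set $Q$ below are assumptions chosen for illustration only, with $r = 2$, $s = 1$ and $J = \{1\}$.

```latex
% Assumed instance: Q = \{ z \in \Re^2 : z_1 > 0 \}, r = 2, s = 1, J = \{1\}, \Pi(z) = z_2.
\begin{align*}
F(z) &= z_1^2 + z_2^2, &
\nabla F(z) &= (2 z_1,\ 2 z_2)^T \ne 0 \ \text{on } Q, \\
f(z) &= \sin(z_1^2 + z_2^2), &
\nabla f(z) &= \nabla F(z)\, d(z), \quad d(z) = \cos(F(z)), \\
q(z) &= (z_1^2 + z_2^2,\ z_2), &
g(u, v) &= \bigl( \sqrt{u - v^2},\ v \bigr) \ \text{on } S = \{ (u, v) : u > v^2 \}, \\
h(u, v) &= f(g(u, v)) = \sin(u), &
\hat{h}(u) &= \sin(u), \qquad f(z) = \hat{h}(F(z)) .
\end{align*}
```

In this instance $h(u, v)$ visibly does not depend on $v$, and one may take $R = Q$.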

Theorem A.2 Let $F : Q \mapsto \Re^s$ be continuously differentiable, where
$Q \subset \Re^r$ is open. We assume that $\mathrm{rank}(\nabla F(z)) < s$,
$\forall z \in Q$, and that $\nabla F_1(z)$ (the gradient of the first component of $F$) is
not identically equal to zero on the set $Q$. Then, there exists some positive integer $i$
and some continuously differentiable function $g$ such that

$$F_{i+1}(z) = g(F_1(z), \ldots, F_i(z)), \quad \forall z \in R,$$

where $R$ is some nonempty open subset of $Q$ and $F_i$ denotes the $i$-th component mapping
of $F$.

Proof: We let $i$ be the largest index such that there exists some $z \in Q$ with the
property

$$\dim \mathrm{span}\{\nabla F_1(z), \ldots, \nabla F_i(z)\} = i.$$

Clearly, $1 \le i < s$. By continuity, there exists some open subset $\hat{Q}$ of $Q$
containing $z$ such that $\nabla F_1(z), \ldots, \nabla F_i(z)$ are linearly independent
for all $z \in \hat{Q}$. By our choice of the index $i$, we have

$$\nabla F_{i+1}(z) \in \mathrm{span}\{\nabla F_1(z), \ldots, \nabla F_i(z)\},
  \quad \forall z \in \hat{Q}.$$

By Theorem A.1, we see that there exists a continuously differentiable function
$h : V \mapsto \Re$ such that

$$F_{i+1}(z) = h(F_1(z), \ldots, F_i(z)), \quad \forall z \in R,$$

where $R$ is some open subset of $\hat{Q}$ and $V = F(R)$. Q.E.D.
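As a quick sanity check of the statement, consider the following mapping, which is an assumed example and does not appear in the paper.

```latex
% Assumed instance: Q = \Re^2, r = s = 2.
\begin{align*}
F(z) &= \bigl( z_1 + z_2,\ (z_1 + z_2)^2 \bigr), \\
\nabla F_1(z) &= (1, 1)^T, \qquad
\nabla F_2(z) = 2 (z_1 + z_2)\, (1, 1)^T, \\
\mathrm{rank}(\nabla F(z)) &= 1 < s = 2, \quad \forall z \in Q, \\
i &= 1, \qquad F_2(z) = g(F_1(z)) \ \text{with}\ g(t) = t^2, \qquad R = Q .
\end{align*}
```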




Theorem A.3 Let $F : U \times V \mapsto \Re^s$ be a continuously differentiable mapping,
where $U$ and $V$ are open subsets of $\Re^r$ and $\Re^t$ respectively. Let
$(u^*, v^*) \in U \times V$ be such that $\mathrm{rank}[\nabla_u F(u^*, v^*)] = s$ and
$F(u^*, v^*) = 0$. Then, there exist some nonempty open sets $W \subset U$,
$\hat{V} \subset V$ such that $u^* \in W$, $v^* \in \hat{V}$ and

$$\{ u \mid F(u, v) = 0 \} \cap W$$

is nonempty and connected for all $v \in \hat{V}$. Furthermore, if
$\{ v_i \in \hat{V} \mid i = 1, 2, \ldots \}$ is a sequence of vectors such that
$\lim_{i \to \infty} v_i = v$ and $v \in \hat{V}$, then there exists a sequence
$\{ u_i \in W \}$ such that $F(u_i, v_i) = 0$ and $\lim_{i \to \infty} u_i = u$ for some
$u \in W$.

Proof: We are in a situation where the assumptions of Lemma A.1 hold⁵. Let $q$, $g$, $R$,
$S$ be given as in Lemma A.1. Thus, $(u, v) = g(q(u, v)) = g(F(u, v), \Pi(u), v)$, for all
$(u, v) \in R$. Let $g_u$, $g_v$ be the corresponding component mappings of $g$ such that
$u = g_u(q(u, v))$ and $v = g_v(q(u, v))$. Since $S$ is open, we can take a connected open
subset of $S$ of the form $W_1 \times W_2 \times \hat{V}$ such that $W_1 \subset \Re^s$,
$W_2 \subset \Re^{r-s}$ and $q(u^*, v^*) \in W_1 \times W_2 \times \hat{V}$. It is easy to
check that $W_2$ is nonempty and connected and that $v^* \in \hat{V}$. Since $g$ is a
diffeomorphism, it follows that the set $g(W_1 \times W_2 \times \hat{V})$ is open.
Moreover, we claim that $g$ has the following properties:

(1) $g_v(w_1, w_2, v) = v$, for all $(w_1, w_2, v) \in W_1 \times W_2 \times \hat{V}$;

(2) $\Pi(g_u(w_1, w_2, v)) = w_2$, for all $(w_1, w_2, v) \in W_1 \times W_2 \times \hat{V}$.

To prove the first property, let us write $(w_1, w_2, v) = q(u, v')$ for some
$(u, v') \in R$. This is possible since $(w_1, w_2, v) \in S$. Hence,
$(w_1, w_2, v) = (F(u, v'), \Pi(u), v')$. It follows that $v = v'$ and
$(w_1, w_2, v) = q(u, v)$. Thus, $g_v(w_1, w_2, v) = g_v(q(u, v)) = v$, which proves (1).
We now show the second property. As we have just seen, there exists some $u$ such that
$(w_1, w_2, v) = q(u, v)$ and $(u, v) \in R$. Thus,
$(w_1, w_2, v) = (F(u, v), \Pi(u), v)$, from which it follows that $w_2 = \Pi(u)$. On the
other hand, we have

$$\Pi(g_u(w_1, w_2, v)) = \Pi(g_u(q(u, v))) = \Pi(u),$$

from which it follows that $w_2 = \Pi(g_u(w_1, w_2, v))$.

Now let $W = g_u(W_1 \times W_2 \times \hat{V})$ and
$S_u(v) = \{ u \in U \mid F(u, v) = 0 \}$. Since $W$ is the projection of the open set
$g(W_1 \times W_2 \times \hat{V})$, it follows that $W$ is open in $\Re^r$. Also, it can be
easily seen that $W \subset U$ and $u^* \in W$. Furthermore, we claim that

$$S_u(v) \cap W = \{ g_u(0, w_2, v) \mid w_2 \in W_2 \}, \quad \forall v \in \hat{V}. \tag{A.7}$$

⁵ Here we have assumed that $r > s$. The same argument works for the case $r = s$ except
that $\Pi$ should be dropped in the remaining proof.




In fact, let us fix some $v \in \hat{V}$ and let $E(v)$ be the set in the right-hand side
of Eq. (A.7). We will show that $E(v) \subset S_u(v) \cap W$. Clearly, $E(v) \subset W$.
Thus, we only need to show that $E(v) \subset S_u(v)$. Let $u$ be an element of $E(v)$.
Then, there exists some $w_2 \in W_2$ such that $u = g_u(0, w_2, v)$. Since
$q(u^*, v^*) = (F(u^*, v^*), \Pi(u^*), v^*) = (0, \Pi(u^*), v^*)$ and
$q(u^*, v^*) \in W_1 \times W_2 \times \hat{V}$, we see that $0 \in W_1$. Thus,
$(0, w_2, v) \in W_1 \times W_2 \times \hat{V}$. In light of property (1), we see that
$v = g_v(0, w_2, v)$. Consequently,

$$F(u, v) = F(g_u(0, w_2, v), g_v(0, w_2, v)) = F(g(0, w_2, v)) = 0.$$

It follows that $E(v) \subset S_u(v) \cap W$.

For the reverse inclusion, given any $u \in S_u(v) \cap W$, we have $F(u, v) = 0$.
Furthermore, there exists some $(w_1, w_2, v') \in W_1 \times W_2 \times \hat{V}$ such that
$u = g_u(w_1, w_2, v')$. By property (2), we see that $\Pi(u) = w_2$. Thus,
$(0, w_2, v) = (F(u, v), \Pi(u), v) = q(u, v)$. Hence,
$u = g_u(q(u, v)) = g_u(0, w_2, v)$. This implies that $u \in E(v)$, and Eq. (A.7) has been
established. As a result, the set $S_u(v) \cap W$ is connected because, according to (A.7),
it is the image of the connected set $W_2$ under a continuous mapping. Since $E(v)$ is
nonempty for each $v \in \hat{V}$, Eq. (A.7) also shows that $S_u(v) \cap W$ is nonempty.

Given a sequence of vectors $\{ v_i \in \hat{V};\ i = 1, 2, \ldots \}$ such that
$\lim_{i \to \infty} v_i = v$ and $v \in \hat{V}$, let us pick $u_i = g_u(0, w_2, v_i)$,
$i = 1, 2, \ldots$, where $w_2$ is some fixed vector in $W_2$. Hence, $u_i \in E(v_i)$, for
all $i$. According to Eq. (A.7), we see that $F(u_i, v_i) = 0$. Furthermore, by the
continuity of $g_u$, we see that

$$\lim_{i \to \infty} u_i = \lim_{i \to \infty} g_u(0, w_2, v_i) = g_u(0, w_2, v),$$

which is clearly in $W$. Q.E.D.

Theorem A.4 Let $Q$ be an open set in $\Re^r$. Let also $F : Q \mapsto \Re^s$ be a
continuously differentiable mapping such that

$$\mathrm{rank}(\nabla F(z)) = s, \quad \forall z \in A, \tag{A.8}$$

where $A = \{ z \mid F(z) = 0 \}$. Suppose that $f : Q \mapsto \Re$ is continuously
differentiable and is a constant function on $A$. Then,

$$\nabla f(z) \in \mathrm{span}\{\nabla F(z)\}, \quad \forall z \in A. \tag{A.9}$$

Proof: Consider the following constrained optimization problem:

$$\min_{z \in A} f(z). \tag{A.10}$$

By assumption, each $z$ in $A$ is an optimal solution to (A.10). Since the regularity
condition (Eq. (A.8)) ensures the existence of a set of Lagrange multipliers, the necessary
condition for optimality gives the desired result ([L 84, page 300]). Q.E.D.
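The conclusion (A.9) can be checked exactly on a small example. The sketch below is an illustration under assumed choices that do not come from the paper: $F(z) = z_1^2 + z_2^2 - 1$ (so $A$ is the unit circle and $s = 1$) and $f(z) = z_1 (z_1^2 + z_2^2 - 1) + 5$, which equals 5 on $A$ but is not a function of $F$ alone. The names `grad_F`, `grad_f` and `in_span` are hypothetical.

```python
from fractions import Fraction


def grad_F(z):
    """Gradient of the assumed constraint F(z) = z1^2 + z2^2 - 1."""
    return (2 * z[0], 2 * z[1])


def grad_f(z):
    """Gradient of the assumed f(z) = z1 * (z1^2 + z2^2 - 1) + 5."""
    return (3 * z[0] ** 2 + z[1] ** 2 - 1, 2 * z[0] * z[1])


def in_span(g, b):
    """For a nonzero b in R^2, g lies in span{b} iff the 2 x 2 determinant vanishes."""
    return g[0] * b[1] - g[1] * b[0] == 0


# Rational points on the unit circle A, so that all arithmetic is exact.
points_on_A = [
    (Fraction(3, 5), Fraction(4, 5)),
    (Fraction(-5, 13), Fraction(12, 13)),
    (Fraction(0), Fraction(1)),
    (Fraction(1), Fraction(0)),
]
on_A = [in_span(grad_f(z), grad_F(z)) for z in points_on_A]

# A point off A, where (A.9) makes no claim.
off_A = in_span(grad_f((Fraction(1), Fraction(1))), grad_F((Fraction(1), Fraction(1))))
```

On $A$ one finds $\nabla f(z) = z_1 \nabla F(z)$, so the Lagrange multiplier is $z_1$; at the point $(1, 1)$, which is not in $A$, the two gradients are not parallel.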